Phoneme-level Indexing for Fast and Vocabulary-independent Voice/voice Retrieval

Authors

  • Alexandre Ferrieux
  • Stéphane Peillon
Abstract

This paper reports explorations on a novel approach for speech information retrieval with spoken queries. The method uses a two-layer decoding scheme, where the intermediary representation of speech is based on phonemes, which makes the system vocabulary-independent. Moreover, the use of synchronized lattices at this intermediary level is shown to improve the discriminative performance while decreasing the size of the parameter space, and with a very reasonable additional computational cost.
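
As a rough illustration of why a phoneme-level intermediary representation makes retrieval vocabulary-independent, the sketch below indexes documents by phoneme trigrams and scores a spoken query by trigram overlap. It is not the authors' method: it assumes each recording has already been decoded into a single best phoneme string by some recognizer, whereas the paper works with synchronized lattices. The function names, the phoneme symbols, and the overlap score are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def ngrams(phonemes, n=3):
    """Return all contiguous phoneme n-grams of a sequence as tuples."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def build_index(docs, n=3):
    """Map each phoneme n-gram to a Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, phonemes in docs.items():
        for gram in ngrams(phonemes, n):
            index[gram][doc_id] += 1
    return index


def search(index, query, n=3):
    """Rank documents by the number of n-grams shared with the query."""
    scores = Counter()
    for gram, q_count in Counter(ngrams(query, n)).items():
        for doc_id, d_count in index.get(gram, {}).items():
            scores[doc_id] += min(q_count, d_count)
    return scores.most_common()


# Illustrative phoneme strings (assumed recognizer output, not real data).
docs = {
    "msg1": "k ao l m iy t ax m aa r ow".split(),
    "msg2": "dh ax m iy t ix ng ih z ow v er".split(),
}
index = build_index(docs)
results = search(index, "m iy t ix ng".split())
print(results)
```

Because matching happens on phoneme n-grams rather than on words, a query never needs to be in a recognizer's word list; the price is that phoneme recognition errors directly reduce the overlap score.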

Similar resources

A hybrid word / phoneme-based approach for improved vocabulary-independent search in spontaneous speech

For efficient organization of speech recordings – meetings, interviews, voice mails, and lectures – being able to search for spoken keywords is essential. Today, most spoken document retrieval systems use large-vocabulary recognition. For the above scenarios, such systems suffer from the unpredictable domain, out-of-vocabulary queries, and generally high word-error rate (WER). In [1], we present...

Synthetic phoneme prototypes and dynamic voice source adaptation in speech recognition

A speech production oriented technique for generating reference spectral data for speech recognition is presented as an alternative to training to natural speech. The potentials of this approach are discussed. In the presented recognition system, the vocabulary and grammar are described as a finite-state network. Phoneme templates are specified in terms of control parameters to a cascade forman...

Synthetic phoneme prototypes and source adaptation in a speech recognition system

A recognition system based on a reference library of synthetic phoneme prototypes is described. The phoneme templates are specified in terms of formant synthesis parameters. The vocabulary and grammar are described in a finite-state network where each state represents a phoneme. A transition between two phonemes in the net is expanded to a number of new states using interpolation on the synthesi...

Teaming Up: Making the Most of Diverse Representations for a Novel Personalized Speech Retrieval Application

In addition to the increasing number of publicly available multimedia documents generated and searched every day, there are also large corpora of personalized videos, images and spoken recordings, stored on users’ private devices and/or in their personal accounts in the cloud. Retrieving spoken items via voice commonly involves supervised indexing approaches such as large vocabulary speech rec...

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for optimizing the performance of GMM-based voice conversion systems, in which the GMM method serves as the baseline for improving voice conversion performance. In the current methods, because a single conversion function is used to convert all speech units, and because of the subsequent spectral smoothing arising from statistical averaging, we observe quality ...

Publication date: 1999